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ABSTRACT 

Planned missions will spatially resolve temperate terrestrial planets from their host star. Although 
reflected light from such a planet encodes information about its surface, it has not been shown how to 
establish surface characteristics of a planet without assuming known surfaces to begin with. We present 
a re-analysis of disk-integrated, time-resolved, multiband photometry of Earth obtained by the Deep 
Impact spacecraft as part of the EPOXI Mission of Opportunity. We extract reflectance spectra of 
clouds, ocean and land without a priori knowledge of the numbers or colors of these surfaces. We show 
that the inverse problem of extracting surface spectra from such data is a novel and extreme instance of 
spectral unmixing, a well-studied problem in remote sensing. Principal component analysis is used to 
determine an appropriate number of model surfaces with which to interpret the data. Shrink-wrapping 
a simplex to the color excursions of the planet yields a conservative estimate of the planet's endmember 
spectra. The resulting surface maps are unphysical, however, requiring negative or larger-than-unity 
surface coverage at certain locations. Our "rotational unmixing" supersedes the endmember analysis 
by simultaneously solving for the surface spectra and their geographical distributions on the planet, 
under the assumption of diffuse reflection and known viewing geometry. We use a Markov Chain 
Monte Carlo to determine best-fit parameters and their uncertainties. The resulting albedo spectra 
are similar to clouds, ocean and land seen through a Rayleigh-scattering atmosphere. This study 
suggests that future direct-imaging efforts could identify and map unknown surfaces and clouds on 
exoplanets. 

Subject headings: methods: data analysis — techniques: photometric — planets and satellites: surfaces 
— planets and satellites: atmospheres — planets and satellites: individual (Earth) 



1. INTRODUCTION 

Next-generation space telescopes will spatially re- 
solve terrestrial exoplanets from their host star 
(jTraub fc Oppenheimeii[2011[ ). The rotational color vari- 
ations of a "pale blue dot" might betr ay the presence o f 
landforms rotating in and out of view (IFord et al.ll200l . 
Ground truth is critical to our understanding of plane- 
tary climate becaus e surface l i quid water is the d cfini- 
tion of habitability (|Abelll993HKasting et al.lll995) and 
long-term habitabi lity may require exposed continents 
(|Abbot et al.l[20T2l) . 

Previous research has shown that rotationa l color vari- 
ations of a planet can yie ld rotation rate (jPalle et al.l 
[200I [Qaklev fc Cash|[200l . coarse pla netary maps un- 
der the assumption of known surfaces (iFuiii et al.l 120101 : 
iFuiii e t al.' '2011': 'Kawahar a fc Fuiiil 12 0101) . single-band 
maps (Kawahara & Fuiii 2 0117 Fuiii fc Kawahara 20121), 
and maps of eigencolors (|Cowan et al.l T2009. 2011). We 
cannot assume, however, that terrestrial exoplanets will 
have the same surface types as Earth, and there is no 
obvious relation between single-band albedo or eigen- 
colors with surface features, so neither of these "exo- 
cartographic" strategies is entirely satisfactory. 

Three methods have been proposed for using orbital 
phase variations to detect surface liquid water on ex- 
oplanets, but they have challenges of their own: ther- 
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mal ine rtia (thermal infrared; 'Gaidos fc Will iaini 120041 : 
iCowan. Voigt fc Abbot 2012), glint (visible -near m- 
frared: iWilliams fc Gaidos 2008; Robins on eTall [20Tol: 
ICowan. Abbot J PVo'mt 2012) and polar ization (visible- 
near infrared; IZuggcr ct al. 201QI, l20Tll) . Significantly, 
these methods leverage variability on the planet's or- 
bital, rather than rotational, timescale. While this makes 
the signal-to-noise requirements less stringent than exo- 
cartography, none of these methods would be able to 
distinguish between partial vs. complete ocean coverage. 

In this Letter, we show how rotational color variations 
can be used to retrieve reflectance spectra of a planet's 
dominant surfaces. Our method does not require a priori 
knowledge of the number of surfaces, let alone their colors 
or geographical distributions. 

2. ANALYSIS 

We re-analyze disk-integrated Earth observations ob- 
tained by the Deep Impact sp acecraft as part of th e 
EPOXI Mission of Opportunity (jLivengood et al.ll20ll . 
Our data consist of hourly measurements of apparent 
albedo. A*, in seven wavebands spanning a single rota- 
tion of Earth (the Earthd time-series). The first and 
last observations occur exactly 24 hours apart but have 
slightly different apparen t albedos because o f day-to-day 
changes in cloud cover (IGoode et al.l I2OOII: IPalle et aij 
l200¥lCowan et al.l 120091 12011D . Soecificallv. given the 
orbital phase near quadrature, the final 6 observations 
probe the same regions as the early part of the time- 
series. 

We "correct" the final 6 observations so that the 
lightcurves may be fit with a static model. At each 
wavelength, we subtract the last datum from the first 
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to estimate the effect of changing cloud cover. We then 
correct the final 6 observations according to this change: 
the 19**^ observation is unaffected, and the correction in- 
creases linearly from the 20**^ to the 25*^ observation. 
After the application of this correction, the last obser- 
vation is identical to the first, by construction. In order 
to avoid giving this point too much weight, we remove 
the final datum, resulting in 24 observations populating 
a 7-dimensional color space (Figure [1]) . 
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Fig. 1. — Changes in the apparent albedo of Earth in seven 
wavebands spanning near-UV to near-IR over 24 hrs, as ob- 
served by the Deep Impact spacecraft as part of the EPOXI 
mission on Jun 4, 2008. The black points show the data with 
our adopted uncertainties. The green lines show our best-fit 
model with 9 longitudinal slices and 3 surface types. 



2.1. Principal Component Analysis 

We center the data by subtracting the planet's time- 
averaged color and then perform principal compo- 
nent analysis (PCA) usi ng the covariance of the data 
(|Cowan et al.l 120091 12011D . The first two principal com- 
ponents have nearly equal power and account for 99% of 
the color variance in our data (90% of the variability). 

There is no one-to-one correspondence between sur- 
face types and principal components, however: the latter 
are normalized and orthogonal and therefore unphysical 
(jCowan et al.|[20Tll) . Furthermore, the number of domi- 
nant principal components does not even correspond to 
the minimum number of surfaces for reflec ted light sur- 
face retrieval, rather, ngurf ^ '^dom + 1 (jCowan et al.l 
120111) . Nevertheless, it is useful to reduce the dimension- 
ality of the data by projecting them onto the principal 
component plane (PCP). In the present case, the PCP is 
defined by the first two principal components (Figure [2]); 
in general it is a hyper-plane. 

2.2. Simplex Shrink- Wrapping 

Insofar as the color excursions of Earth lie in the PCP, 
then so must its dominant surfaces. Since three point 
uniquely define a plane, Occam's razor dictates that we 



try a three-surface solution. If one assumes that differ- 
ent surfaces and regions combine linearly to determine 
the planet's over-all colors, then the three pure surface 
spectra must define a triangle that encloses the data. 

Estimating broadband endmember spectra from our lo- 
cus of data is isomorphic to unsupervis ed spectral unmix- 
ing of multi-sp ectral satellite images (ISabol et al.lll99l 
lKeshava1l2003[ ). In traditional remote sensing, the data 
consists of colors at each pixel in a spatially-resolved im- 
age; in the present exoplanet application, the data con- 
sist of colors at each point in time, where there is a non- 
trivial convolution tying the time-series to spatial inho- 
mogeneities on the surface of the planet. 

We adop t a remote sensing algorithm: simplex shrink- 
wrapping (iFuhrmann 1 119991 ) . Recall that a simplex in 
A^-dimensional space is a polyhedron with -f 1 vertices: 
on a line the simplex is a line segment, on a plane the 
simplex is a triangle, in three-dimensions the simplex is 
a tetrahedron, etc. 

Simplex shrink-wrapping entails finding the simplex 
with the smallest volume (length in ID; area in 2D) that 
encloses all the data. Since only the most extreme col- 
ors constrain the shrink-wrapping, it is expedient to first 
determine the convex hull of the locus in the PCP (the 
dotted black line in left panel of Figure ^ . This step 
is not computationally necessary for our small (24 x 7) 
data matrix, but may be necessary for longer observa- 
tions and/or higher spectral resolution. 

The apparent albedo can be modeled as the matrix 
product of surface spectra, 5, and apparent covering frac- 
tion, /*: A*[nt,nx] = /*[nt,nsurf] x S[nsuri,nx], where 
ut is the number of observations and nx is the number 
of wavebands. Insofar as the observations do not lie per- 
fectly in the principal component plane, this model can- 
not perfectly match the data. Instead, the object is to 
match the projected albedo in the PCP. Simplex shrink- 
wrapping amounts to solving for /* and S given A* and 
the constraint of minimum volume. The vertices of our 
shrink-wrapped triangle are denoted by triangles in Fig- 
ure [21 The shrink-wrap endmembers are more likely to 
be unique if the data locus is non-spherical. If one knew 
nothing about the planet's orbital plane or viewing ge- 
ometry, endmembers would be the most conservative es- 
timate of surface spectra. 

The apparent covering fraction is related to the 
planetary covering fraction, /, by the convolution: 
/*[nt,nsurf] = W^["t, "•slice] X /Kiiccrisurf], where 7isiico 
is the number of longitudinal slices on the planet. The 
weight, W, quantifies the visibility and illumination of 
a given longitudinal slice on the planet at a given point 
in time. It is computed given the known sub-observer 
and sub-stella r positions at the time of each observation 
(jCowan et al.ir20 11). Although we adopt a spatial resolu- 
tion of risiice = 9, the W and / arrays are oversampled by 
a factor of 100 in order to reduce numerical integration 
error. 

Since is a known low-pass filter, acceptable /* (be- 
tween zero and unity, and summing to unity at each point 
in time) may require unphysical /. Indeed, if we adopt 
the shrink-wrap endmember spectra as bona fide surface 
spectra, the deconvolution, /* ^ /, produces surface 
maps with covering fractions greater than unity, or neg- 
ative. Since there is no possible surface map of the end- 
members that matches the data, the endmembers must 
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Fig. 2. — Principal Component Plane. Left: The black crosses connected by the solid black line show the EPOXI data projected 
onto the principal component plane. The dotted black line shows the convex hull of the observations. The colored triangles 
show the endmember spectra obtained by shrink-wrapping a simplex (a triangle, in this case) onto the convex hull. The green 
line shows the color excursions of our best-fit model obtained by rotational unmixing. Right: The black points are the EPOXI 
data after zooming out. The colored triangles are the same endmember spectra as on the left. The colored squares with error 
bars are the surface spectra retrieved from the data by rotational unmixing. The colored asterisks show predicted spectra of 
ocean, land and cloud combined with Rayleigh scattering (Robinson ct al. 2011; Cowan ct al. 2011). 



not correspond to the colors of actual surfaces. In other 
words, the shrink-wrap produces albedo spectra that are 
too conservative, and geographies that are too liberal. 

2.3. Rotational Unmixing 

The essential difference between the present applica- 
tion and traditional spectral unmixing is that while the 
surface coverage, and hence colors, of a pixel may be ar- 
bitrarily different from that of its neighbor, the colors of 
a disk-integrated planet cannot change arbitrarily fast. 

The third and final step of our analysis, after PC A 
and shrink-wrapping, is to solve for surface spectra, S, 
and geographies, /, given A* and W: A*[nt,nx] = 
W[nt,nsiicc] X /[nsiico,nsiirf] x S'[nsurf, "a]- This amounts 
to fitting the path of the planet in color-space by simulta- 
neously varying the three surface spectra and their plan- 
etary geography. While simplex shrink-wrapping only 
uses the convex hull of the data, rotational unmixing 
makes use of all the observations. 

We use a 3,000,000-step Markov Chain Monte Carlo 
(MCMC) to deter mine the be st parameters and their 
uncertainties (For d et al.l l200l[ ). We initialize the sur- 
face spectra at the endmember spectra from the shrink- 
wrapping to speed up convergence, and begin with a uni- 
form surface map with equal amounts of the three sur- 
faces. Before evaluating a step in the MCMC, we ensure 
that the surface spectra have albedos between zero and 
unity at all wavelengthf0, and that the covering frac- 
tions are between zero and unity at every location on 
the planet. 

The MCMC sequentially takes seven ID steps along 
each principal color for each surface spectrum (risurf x 
nx = 21 steps), and a 3D step in map space for each 
longitudinal slice of the planet, normalized such that the 
sum of covering fractions is unity at each slice (Usiicc = 9 
steps). Ste p sizes ar e tuned to get an acceptance ratio 
of - 0.25 ((Gelman|[2003[) . We adopt Gaussian uncer- 
tainties of 0.0033 on the apparent albedo measurements, 

^ Note that in practice it may be difficult to measure the absolute 
albedo of directly imaged terrestrial planets. 



which produce a best-fit reduce d of unity (essen tially 
the same 1% uncertainties as in lCowan et al.ll2009| ). 

The surface spectra are not constrained to lie in the 
PCP, but the best fit solutions are close to the plane, as 
expected from PCA. Quantitatively, the surface spectra 
have 5-dimensional Euclidean distances from the princi- 
pal component plane of 0.008, 0.007, and 0.009, for an 
aspect ratio of ^ 2% (cf. r ight p anel of Figure [2|). 

Following iCowan et al.l ([lOOw) and as dictated by the 
orbital phase, we use 9 longitudinal slices on the planet 
(40° spatial resolution, with fixed longitudes for the 
slices), convolving the surface map with the instanta- 
neous illumination and visibility functions assuming dif- 
fuse (Lambertian) reflection and known viewing geome- 
try (rotation rate, obliquity, orbital and seasonal phases). 
Since the covering fractions for the three surfaces must 
sum to unity at each location on the planet, the model 
has (risurf — l)?isiicc + nsuttnx = 39 free parameters. 

The green lines in Figure [T] show the lightcurves for our 
best-fit model. The green line in the left panel of Fig- 
ure [2] shows the centered color variations of the model 
projected on the PCP. The gray, red, and blue colored 
squares with error bars in Figure [2] denote our best-fit 
surface spectra projected onto the PCP. The three spec- 
tra correspond roughly to clouds, land, and ocean; their 
deprojected albedo spectra are shown in Figure [3l 

In order to gauge the accuracy of the retrieval, we take 
empi rical reflectance spectr a of clouds, ocean and land 
from iRobinson et al.l ()201lD and combine them with an 
empirical model of disk-integrated Rayleigh scattering 
(jCowan et al.l 120 lH) . The predicted surface spectra are 
the colored asterisks in Figure [21 and the colored dashed 
lines in Figure [3l 

The qualitative agreement between the retrieved and 
predicted surface spectra is remarkable when one consid- 
ers that the surface spectra are moving targets. Ocean, 
land and clouds on Earth are hardly uniform, and the 
path-length of Rayleigh scattering is a function of loca- 
tion on the disk of the planet. We have incorporated 
Rayleigh scattering in a simple way and have made no 
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Fig. 3. — Surface spectra of Earth as determined by the 
EPOXI observations. The solid colored lines show the best-fit 
surface spectra obtained by rotational unmixing; the colored 
dashed lines are the predicted spectra. The colors are the 
same as in Figure [2] and correspond to clouds (gray) , ocean 
(blue), and land (red) viewed through a Rayleigh scattering 
atmosphere. The dotted black line shows the time-averaged 
broadband albedo spectrum of Earth. 

attempt to account for water vap or absorption at 950 nm 
or any other atmospheric effects (|Shaw fc Burke |[2003[ ). 




Fig. 4. — Longitudinal maps of covering fraction, with ver- 
tical thickness of the lines corresponding to la uncertainty. 
The colors are the same as in Figure [2] and correspond to 
clouds (gray), ocean (blue), and land (red) viewed through a 
Rayleigh scattering atmosphere. 

The longitudinal maps of the three model surfaces 
(Figure H]) are not a good match to a cloud-free map 
of Earth. This is not surprising given our reliance on 
data from a single day, and the prevalence of obscuring 
clouds. The large land content retrieved for the mid- 
dle of the Pacific Ocean is spurious and may be due to 
our correction for changeable cloud cover (the observa- 
tions begin and end with the spacecraft over the Pacific) . 
The right panel of Figure [2] hints at why the rotational 
map of Principal Component 1 was remarkably faithful 
to the actual surf ace geography of Earth (Figure 10 of 
ICowan et al.ll2009l ): the land spectrum is pure PClQ As 
noted in this paper, however, there is no reason to believe 



* It should be noted, however, that ICowan et al.l I I2009I ) per- 
formed PCA simultaneously on both Earthl and EarthS time se- 



that "maps" of principal components should accurately 
reflect physical conditions on the planet. 

3. DISCUSSION 

The surface retrieval scheme presented here should be 
generally applicable, provided sufficient Euclidean dis- 
tance between surfaces in color space (surfaces should 
look different), and large-amplitude variations in appar- 
ent covering fractions (large, longitudinally distinct geo- 
graphical features). There are a number of well-justified 
assumptions made in the present work that should even- 
tually be relaxed, however: 

1) We assume Lambertian reflection for the purposes 
of convolving the planetary map of covering fractions 
with the visibility and illumination to obtain appar- 
ent covering fractions. This assumption should be 
correct for the present data since they were obtained 
with Earth at slightly gibbo us phase (Williams fc GaidosI 
120081 : iRobinson et al.ll20ldl) . In order to properly inter- 
pret data at crescent phases, however, it may be neces- 
sary to relax this assumption. 

2) We adopt the known viewing geometry of Earth at the 
time of the observations. Any observations able to mea- 
sure the rotational color- variations of an exoplanet would 
be more than adequate for estimating orbital phase and 
rotation period. The planetary obliquity and seasonal 
phase will not be known a priori, but simulations of full- 
orbit multiband lightcurves indicate t hat these geometri- 
cal parameters should be retrievable (jKawahara fc Fuml 
IMfi [20ll iFilTii fc Kawahara 20ll . 

3) We assume a static surface map for the planet, but 
in order to properly interpret weeks-months of data it 
would be imperative to allow the cloud cover to evolve 
in time. Clouds completely obscure the underlying sur- 
face in our linear model. For example, 33% cloud cov- 
erage at a given location means that that one-third of 
that pixel is covered in completely impenetrable clouds, 
while the remaining 67% is perfectly cloud-free. Given 
this parametrization, changes in cloud cover necessarily 
involve changes in ocean and land coverage, and a signif- 
icant increase in model complexity, all else being equal. 

4) We assume that clouds combine linearly with actual 
surfaces. Optically thin clouds, however, obscure un- 
derlying surfaces while contri buting to the refle ctance 
spectrum, a non-linear effect (jSabol et al.l |1992() . Nu- 
merical experiments indicate that the dimensionality of 
the apparent albedo locus is s till nsurf — 1 foi' re alistic, 
non- linear, radiative transfer (jCowan et al.l[20Tll l. Our 
experiments with a simple non-linear three-surface toy 
model further indicate that the apparent albedo locus 
is amenable to simplex shrink-wrapping, though none of 
the endmembers may correspond to a pure cloud spec- 
trum (Figure [5|). In addition to being physically mo- 
tivated, a non-linear cloud model would allow surface 
covering fractions to remain fixed despite changing cloud 
cover, enabling accurate surface maps with data span- 
ning many planetary rotations. A major challenge with 
this approach would be determining, a priori, which sur- 
faces should combine convexly and which should not. 

5) Although in the present case the surface spectra 
were allowed to leave the principal component plane, it 
may be computationally necessary to constrain them to 
the PCP for larger nx and/or rigurf- 
6) We adopt three surfaces because the power spectrum 
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Fig. 5. — Non-linear 3-surface toy model. The blue circles 
show the pure surface spectra: land (0.0, 0.4), ocean (0.1, 0.1), 
and cloud (0.6, 0.6). The red circles show the colors of indi- 
vidual pixels, while green circles show the planet's appar- 
ent albedo path over a single planetary rotation. The cover- 
ing fraction of land and ocean must sum to unity for every 
pixel, but for clouds we use a simple parameterization that 
captures the non-linear behavior of atmospheric scattering, 
A = 1 - (1 - Aloud) (1 - Aaurf) , where ^cio ud and Aaurf 
are t he cloud and surface albedos, respectively l|Cowan et al.l 
l2011f ). Note that the pixel colors do not all lie within the 
triangle defined by the surface spectra, but are still amenable 
to endmember analysis. 



of the albedo variations in dominated by the first two 
principal components. It is possible, however, that Earth 
has four or more surfaces that all happen to lie very close 
to the PCP. These surfaces might only betray themselves 



at higher spectral resolutions. Putting aside that patho- 
logical case, it is plausible that some terrestrial exoplan- 
ets will have more than three major surface/cloud types, 
leading to a higher-dimensional locus in color space. This 
would make the convex-hull and shrink-wrapping more 
computationally intensive, but existing algor ithms are 
routi i iely applied to higher dimensional data (jKeshava I 
120031: ISiiaw fc Burke I [2003D . It is likely that the mor- 
phology of color variations in a higher-dimensional color 
space will still provide enough leverage to identify surface 
spectra, as was the case in the present study. 

4. CONCLUSIONS 

Our three-step surface-retrieval scheme consists of 1) 
performing principal component analysis on the multi- 
spectral reflectance matrix and projecting the data onto 
the principal component plane, 2) shrink-wrapping a sim- 
plex onto the projected data, and 3) relaxing the simplex 
vertices in order to match the time-variations in disk- 
integrated color. 

Although we developed the method for directly-imaged 
terrestrial planets, there is no reason this method could 
not be used to determine the colors of clouds on directly- 
imaged gas giants, or the colors of albedo markings on 
unresolved Solar System bodies. 

W.M. Farr, B.W. Heumann, and N.A. Kaib provided 
valuable comments on early versions of this work. The 
EPOXI Mission of Opportunity is supported by the 
NASA Discovery Program. The Deep Impact spacecraft 
was built by Ball Aerospace. The Earth observations, 
acquired as part of the EPOCh component of EPOXI, 
are available through MAST. 
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